This employee promotion dataset describes the characteristics of employees who were promoted and those who were not.
The dataset contains the following columns:
employee_id : Unique ID for employee.
department : Department of employee.
region : Region of employment (unordered).
education : Education level.
gender : Gender of employee.
recruitment_channel : Channel of recruitment for employees.
no_of_trainings : Number of other trainings completed in the previous year on soft skills, technical skills, etc.
date_of_birth : Employee date of birth.
age : Age of employee.
join_date : Employee join date.
previous_year_rating : Employee rating for the previous year.
length_of_service : Length of service in years.
KPIs_met >80% : 1 if the percentage of KPIs (Key Performance Indicators) met is above 80%, else 0.
awards_won? : 1 if awards were won during the previous year, else 0.
avg_training_score : Average score in current training evaluations.
is_promoted : Recommended for promotion.

from dash import Dash, html, dcc, Input, Output
import plotly.express as px
import pandas as pd
import dash_bootstrap_components as dbc
promotion = pd.read_csv('promotion_clean.csv')
promotion[['department','region','education',
'gender','recruitment_channel',
'KPIs_met >80%','awards_won?',
'is_promoted']] = promotion[['department','region',
'education','gender',
'recruitment_channel',
'KPIs_met >80%','awards_won?',
'is_promoted']].astype('category')
# An explicit unit is needed: pandas 2.x rejects the unit-less 'datetime64' dtype
promotion[['date_of_birth','join_date']] = promotion[['date_of_birth','join_date']].astype('datetime64[ns]')
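To check that the conversions took effect, the dtypes can be inspected. The sketch below runs the same kind of conversion on a small toy frame (the values are made up) and uses `pd.to_datetime` as an alternative to `astype` for the date column.

```python
import pandas as pd

# Toy frame standing in for promotion_clean.csv; the values are made up.
toy = pd.DataFrame({
    'gender': ['m', 'f', 'f'],
    'join_date': ['2019-05-01', '2020-01-15', '2020-02-29'],
})

# Text column with few distinct values -> category
toy['gender'] = toy['gender'].astype('category')

# Date strings -> datetime
toy['join_date'] = pd.to_datetime(toy['join_date'])

print(toy.dtypes)
```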
This is the information on the employees in our start-up. It helps identify who is a potential candidate for promotion.
promotion.shape[0]
54808
promotion[promotion['is_promoted']=='Yes'].shape[0]
4668
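The two counts above can be combined into an overall promotion rate. A minimal sketch of that arithmetic, using the counts printed above (the variable names are illustrative):

```python
# Counts taken from the two outputs above.
total_employees = 54808
promoted_employees = 4668

# Share of all employees with is_promoted == 'Yes'.
promotion_rate = promoted_employees / total_employees
print(f"{promotion_rate:.2%}")  # prints 8.52%
```

So roughly one employee in twelve is recommended for promotion, which is worth keeping in mind when comparing departments.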
Number of employees in each department by promotion status
data_agg = promotion.groupby(['department','is_promoted']).count()[['employee_id']].reset_index()
data_agg = data_agg.sort_values(by = 'employee_id')
bar_plot1 = px.bar(
    data_agg,
    x = 'employee_id',
    y = 'department',
    color = 'is_promoted',
    color_discrete_sequence = ['#618685','#80ced6'],
    barmode = 'group',
    orientation='h',
    template = 'ggplot2',
    labels = {
        'department': 'Department',
        'employee_id': 'Number of employees',
        'is_promoted': 'Is Promoted?',
    },
    title = 'Number of employees in each department',
    height=700,
).update_layout(showlegend=False)
bar_plot1
We gain the insight that the Sales & Marketing department has the highest proportion of promotions relative to the number of employees in that department.
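Since the bar chart shows absolute counts, a statement about proportions is easier to verify with a per-department rate. The sketch below shows one way to compute it on toy data (the rows are made up and do not reflect the real dataset); the same expression should work on `promotion` with its `department` and `is_promoted` columns.

```python
import pandas as pd

# Toy data standing in for the real dataset; the values are made up.
toy = pd.DataFrame({
    'department': ['Sales & Marketing'] * 4 + ['Technology'] * 2,
    'is_promoted': ['Yes', 'No', 'No', 'No', 'Yes', 'No'],
})

# The mean of a boolean mask per group is the share of promoted employees.
promotion_rate = (
    (toy['is_promoted'] == 'Yes')
    .groupby(toy['department'])
    .mean()
    .sort_values(ascending=False)
)
print(promotion_rate)
```

In this toy example the larger department has the lower rate, which is why counts and proportions should be read separately.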
data_2020 = promotion[promotion['join_date'] >= '2020-01-01']
data_2020 = data_2020.groupby(['join_date']).count()['employee_id'].reset_index().tail(30)
line_plot2 = px.line(
    data_2020,
    x='join_date',
    y='employee_id',
    markers=True,
    color_discrete_sequence = ['#618685'],
    template = 'ggplot2',
    labels={
        'join_date':'Join date',
        'employee_id':'Number of employees'
    },
    title = 'Number of new hires in the last 30 days',
    height=700,
)
line_plot2
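Note that `.tail(30)` keeps the last 30 distinct join dates, which matches the last 30 calendar days only if someone joined on every day. A sketch of a calendar-based window on toy data follows; the dates are made up, and `cutoff`, `recent`, and `daily_hires` are illustrative names.

```python
import pandas as pd

# Toy hires with made-up join dates (gaps on purpose).
toy = pd.DataFrame({
    'employee_id': [1, 2, 3, 4, 5],
    'join_date': pd.to_datetime(
        ['2020-01-02', '2020-03-01', '2020-03-01', '2020-03-20', '2020-03-31']
    ),
})

# Keep only hires within 30 days of the most recent join date.
cutoff = toy['join_date'].max() - pd.Timedelta(days=30)
recent = toy[toy['join_date'] > cutoff]

# Count hires per calendar day; days without hires show up as 0.
daily_hires = (
    recent.set_index('join_date')
    .resample('D')['employee_id']
    .count()
)
print(daily_hires)
```

Resampling to daily frequency also makes the gaps visible in a line plot instead of connecting distant dates directly.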
promotion['department'].unique()
['Sales & Marketing', 'Operations', 'Technology', 'Analytics', 'R&D', 'Procurement', 'Finance', 'HR', 'Legal']
Categories (9, object): ['Analytics', 'Finance', 'HR', 'Legal', ..., 'Procurement', 'R&D', 'Sales & Marketing', 'Technology']
data_agg = promotion[promotion['department'] == 'Technology']
hist_plot3 = px.histogram(
    data_agg,
    x = 'length_of_service',
    nbins = 20,
    color_discrete_sequence = ['#618685','#80ced6'],
    title = 'Length of Service Distribution in Technology Department',
    template = 'ggplot2',
    labels={
        'length_of_service': 'Length of Service (years)',
    },
    marginal = 'box',
    height=700,
)
hist_plot3
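The box plot in the margin summarizes the distribution by its quartiles. As a rough numerical counterpart, the sketch below computes the quartiles and flags outliers with the common 1.5 x IQR rule on made-up values; Plotly's own quartile method may differ slightly from the pandas default used here.

```python
import pandas as pd

# Made-up lengths of service (years), for illustration only.
length_of_service = pd.Series([1, 2, 3, 4, 5, 6, 7, 8, 20])

# Quartiles with pandas' default linear interpolation.
q1, median, q3 = length_of_service.quantile([0.25, 0.5, 0.75])
iqr = q3 - q1

# Values beyond 1.5 * IQR from the quartiles are treated as outliers.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = length_of_service[
    (length_of_service < lower_fence) | (length_of_service > upper_fence)
].tolist()

print(q1, median, q3, outliers)
```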
